Pitchbook Data Exploration

Author

Inder Majumdar

Published

February 3, 2026

The Pitchbook data provides useful aggregate statistics at the state level. Data include the number of VC deals and the number of dollars invested (what I call “capital committed”) per year.

We’re trying to explain an odd discrepancy: when we look at capital committed or deal count, Wisconsin lags other states and categories. However, when we look at “per 1 million population” figures, the pattern flips. The goal is to better understand what years might be driving this, and come up with a normalization that “makes sense.”

We’ll redact most of the wording below when communicating to a general audience. However, the purpose of this document is to make clear how each figure has been made thus far and suggest some alternatives that might make more sense.

Raw Numbers: How does Wisconsin Fare w.r.t. Capital Committed, VC Dealcount? How we got here

These were among the original figures we put together. To save time, I’ll show you the “aggregated” figures we landed on before trying to normalize.

My understanding of our initial reactions is as follows: 1) Given the economic composition of WI, the above figures make sense. Tech and other VC-heavy sectors don’t comprise as large a portion of the state economy as in other states. 2) Wisconsin has a higher percentage of its population living in rural areas than the average state, so not accounting for this might unfairly skew the measure (a flavor of omitted variable bias).

We next decided to look at normalizing by a measure of population (see next section).

Normalization #1: Per 1 M Population

Normalizations are calculated in the aggregate as follows:

\[ Y_{agg,i} = \frac{\sum_{t}{\sum_{s \in i}{y_{s,t}}}}{\sum_t{\sum_{s \in i}{pop_{s,t}}}} \]

In words, I separately sum the numerator and denominator, then take a ratio. In the numerator, I first sum \(y\) across the states in a group (e.g. National, Big 3) and then sum across years to get the aggregate. I follow the same procedure for the denominator before dividing. \(pop\) is defined by taking the BLS/LAUS county-level employment statistics and dividing by the participation rate to recover an estimate of the “work-eligible” population.
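As a rough illustration of the ratio-of-sums calculation described above, here is a minimal Python sketch. The records, state labels, and function names are all hypothetical, and the scaling by 1,000,000 is an assumption taken from the “per 1 M population” label rather than from the formula itself.

```python
# Hypothetical records: (state, year, deals, employment, participation_rate).
# Every number here is invented for illustration only.
records = [
    ("A", 2021, 10, 500_000, 0.5),
    ("A", 2022, 20, 500_000, 0.5),
    ("B", 2021, 30, 1_500_000, 0.5),
    ("B", 2022, 40, 1_500_000, 0.5),
]


def work_eligible_pop(employment, participation_rate):
    """Estimate the work-eligible population as employment / participation rate."""
    return employment / participation_rate


def ratio_of_sums_per_million(rows, group):
    """Sum deals and population separately over the group's states and years,
    then take a single ratio and scale it to 'per 1 million population'."""
    numerator = sum(deals for state, _, deals, _, _ in rows if state in group)
    denominator = sum(
        work_eligible_pop(emp, rate)
        for state, _, _, emp, rate in rows
        if state in group
    )
    return 1_000_000 * numerator / denominator


print(ratio_of_sums_per_million(records, {"A", "B"}))
```

Because the numerator and denominator are pooled before dividing, a group's value is driven mostly by its most populous states.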

What concerned us here was the dramatic swing in pattern - now Wisconsin dwarfs the comparison groups.

Most recent year, normalization 1:

Next I’ll break down VC deal count and capital committed by state.

Interactive: Deal Count

Interactive Comparison to bottom 10

I think I figured out our problem. Consider the lowest five states by deal count by year:

How do my original comparison groups look over time (interactive)?

Fix to Normalization 1

I think what’s happening is a weird version of Jensen’s inequality: a ratio of sums is not the same as an average of ratios. Because the pooled ratio implicitly weights each state by its population, states with high populations relative to VC activity weigh down the metric. To Tessa’s point, my equation above doesn’t really give us a typical apples-to-apples comparison. So instead, let’s change the metric to:

\[ Y_{agg,i} = \sum_{t} \sum_{s \in i} \frac{y_{s,t}}{pop_{s,t}} \]

So then each comparison group is constructed as an “average state.”
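To see why the two constructions can disagree, note that for a single year the pooled ratio is a population-weighted average of the per-state ratios:

\[ \frac{\sum_{s} y_{s}}{\sum_{s} pop_{s}} = \sum_{s} w_{s} \frac{y_{s}}{pop_{s}}, \qquad w_{s} = \frac{pop_{s}}{\sum_{s'} pop_{s'}} \]

The sketch below compares the two on invented numbers. The state labels, values, and function names are hypothetical. It reports the sum of ratios as written in the formula, and also that sum divided by the number of states, which is one possible reading of “average state.”

```python
# Hypothetical single-year data: state -> (deals, population).
# The numbers are invented purely to illustrate the weighting effect.
states = {
    "Small": (5, 500_000),
    "Large": (20, 10_000_000),
}
PER = 1_000_000  # report rates per 1 million population


def ratio_of_sums(data):
    """Pool deals and population first, then divide once."""
    total_deals = sum(deals for deals, _ in data.values())
    total_pop = sum(pop for _, pop in data.values())
    return PER * total_deals / total_pop


def sum_of_ratios(data):
    """Compute each state's rate first, then add the rates up."""
    return sum(PER * deals / pop for deals, pop in data.values())


def mean_of_ratios(data):
    """Sum of ratios divided by the number of states: an 'average state'."""
    return sum_of_ratios(data) / len(data)


print("ratio of sums :", ratio_of_sums(states))
print("sum of ratios :", sum_of_ratios(states))
print("mean of ratios:", mean_of_ratios(states))
```

With these made-up inputs the per-state rates are 10 and 2 per million, so the “average state” sits at 6, while the pooled ratio is roughly 2.4 because the large state carries about 95% of the population weight.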

APPENDIX Normalization 2: Per 1,000 Businesses

Following the formula above, we then tried a different normalization: we normalized each statistic to be “per 1,000 businesses” using county-level BFS (Business Formation Statistics) from the Census.